Client Report - Famous Names

Unit 1 Task 3

Author

Daniel Watts

import pandas as pd
import numpy as np
from lets_plot import *

LetsPlot.setup_html(isolated_frame=True)
# Learn more about Code Cells: https://quarto.org/docs/reference/cells/cells-jupyter.html

# Include and execute your code here
df = pd.read_csv("https://github.com/byuidatascience/data4names/raw/master/data-raw/names_year/names_year.csv")

QUESTION 1

Mary, Martha, Peter, and Paul are all Christian names. From 1920 - 2000, compare the name usage of each of the four names in a single chart. What trends do you notice? You must provide a chart. The years labels on your charts should not include a comma.

These four biblical names were far more common in the first half of the 20th century. Usage peaked around the 1940s, in the years surrounding World War II, then fell sharply by the 1970s, and the names have been much less common since.

# Q1
(
    ggplot(
        data=df.query("name == ['Martha', 'Peter', 'Mary', 'Paul']"),
        mapping=aes(x='year', y='Total', color='name'),
    )
    + geom_line()
    + scale_x_continuous(limits=[1920, 2000], format='d')
)
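
The peak years described in the answer can also be checked numerically instead of being read off the chart. The following is a minimal sketch, assuming the data has the `name`, `year`, and `Total` columns used in the plot. The helper `peak_year_by_name` and the toy DataFrame are hypothetical and exist only for illustration; the toy numbers are not taken from names_year.csv.

```python
import pandas as pd


def peak_year_by_name(df, names, start=1920, end=2000):
    """Return {name: year with the highest Total} within [start, end]."""
    subset = df[df["name"].isin(names) & df["year"].between(start, end)]
    # idxmax gives the row label of the largest Total within each name group
    peak_rows = subset.loc[subset.groupby("name")["Total"].idxmax()]
    return {n: int(y) for n, y in zip(peak_rows["name"], peak_rows["year"])}


# Toy data for illustration only (made-up values, not the real dataset)
toy = pd.DataFrame({
    "name": ["Mary", "Mary", "Mary", "Paul", "Paul", "Paul"],
    "year": [1920, 1950, 1980, 1920, 1950, 1980],
    "Total": [100.0, 300.0, 50.0, 40.0, 90.0, 120.0],
})
print(peak_year_by_name(toy, ["Mary", "Paul"]))
```

With the real data loaded as `df`, the call would presumably be `peak_year_by_name(df, ['Martha', 'Peter', 'Mary', 'Paul'])`.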

QUESTION 2

Think of a unique name from a famous movie. Plot the usage of that name and see how changes line up with the movie release. Does it look like the movie had an effect on usage? You must provide a chart. The years labels on your charts should not include a comma.

Tiffany first came into use as a name around the 1961 release of Breakfast at Tiffany's, then rose steadily until it peaked in 1988.

# Q2
(
    ggplot(data=df.query("name == 'Tiffany'"), mapping=aes(x='year', y='Total'))
    + geom_line()
    + scale_x_continuous(limits=[1955, 2015], format='d')
    # Dashed vertical line marking the 1961 movie release
    + geom_vline(xintercept=1961, linetype=5, color='blue')
    + geom_text(x=1960, y=8000, label="Breakfast at Tiffany's Release Date", angle=90, color='blue')
    + geom_text(x=1985, y=16300, label="Peak Tiffany in 1988")
)
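
One rough way to put a number on the "did the movie have an effect" question is to compare average usage in the years just before and just after the release. The sketch below assumes the same `name`, `year`, and `Total` columns as the plot. The helper `mean_usage_around`, the window size, and the toy values are hypothetical choices for illustration, not part of the original analysis, and a before/after difference alone does not show that the movie caused the change.

```python
import pandas as pd


def mean_usage_around(df, name, event_year, window=5):
    """Return (mean Total before, mean Total after) the event year.

    The event year itself is excluded from both sides.
    """
    rows = df[df["name"] == name]
    before = rows[rows["year"].between(event_year - window, event_year - 1)]
    after = rows[rows["year"].between(event_year + 1, event_year + window)]
    return float(before["Total"].mean()), float(after["Total"].mean())


# Toy data for illustration only (made-up values, not the real dataset)
toy = pd.DataFrame({
    "name": ["Tiffany"] * 7,
    "year": [1958, 1959, 1960, 1961, 1962, 1963, 1964],
    "Total": [10.0, 20.0, 30.0, 50.0, 200.0, 300.0, 400.0],
})
before, after = mean_usage_around(toy, "Tiffany", 1961, window=3)
print(before, after)
```

If a name has no rows on one side of the event year, the corresponding mean comes back as NaN, which is worth checking before drawing any conclusion.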